ESTIMATION AND DETECTION OF SIGNALS 
IN MULTIPLICATIVE NOISE 
We define a class of detection-estimation problems on matrix Lie groups
in which the observation noise is multiplicative in nature. By examining 
the differential versions of the hypotheses, which are bilinear in 
nature, we are able to derive the relevant likelihood ratio formula 
and the associated optimal estimation equations for the signal given 
the observations and the assumption that the signal is present. These 
estimation equations are of interest in their own right, in that they 
represent a finite dimensional optimal solution to a nonlinear esti- 
mation problem and can be viewed as consisting of a Kalman-Bucy filter 
along with the on-line computation of the solution of the associated 
Riccati equation, which is driven by the observations. The usefulness 
of these results is illustrated via an example concerning the detection 
of an actuator failure in a rigid body rotational control system. 


*Decision and Control Sciences Group, Electronic Systems Laboratory, 
Department of Electrical Engineering, M.I.T., Cambridge, Mass. This 
work was supported in part by NASA under Grant NGL-22-009-124.



I. Introduction


Kailath [1], [2] and Duncan [3] have derived rather general
likelihood ratio equations for the detection of signals in additive 
noise. These equations explicitly involve the optimal least squares 
estimate of the signal given the observations and the assumption that 
the signal is present. In general, the optimal signal estimation 
equations are infinite dimensional in nature, and thus the practical 
implementation of the estimation-detection equations in the general
case necessarily involves suboptimal, finite-dimensional approximations.

Thus it is of interest to find classes of signal and observation 
processes for which the optimal systems can be realized by finite- 
dimensional sets of equations. Of course the best known example of 
this type is the class of signals generated by linear systems driven 
by white noise and the class of observations that are linear in the 
signal and involve additive observation noise only. In this case, the 
optimal signal estimate is generated by a Kalman-Bucy linear filter 
[4], [5], and the detection equations can be implemented quite easily. 

Recently, there have been several papers [6]-[14] that point
out that estimation problems for certain (right- or left-invariant)
bilinear observation processes can also be handled rather nicely. In
this case, the tools of Lie theory [15]-[17] are of value in deriving
equations for the optimal estimation system, which consists of a
nonlinear preprocessor followed by a linear filter. The extension of
these bilinear estimation results to the detection problem was carried
out by Lo [13], who obtained finite dimensional estimation-detection
results for certain right-invariant bilinear observation processes.

In this paper, we consider a somewhat different class of
estimation-detection problems. As in Lo's case [13], our observation and
signal processes evolve on certain matrix Lie groups, but in our case
the observation noise enters multiplicatively. Such a model was first
considered in [10] and [12] in relation to the estimation of the angular
velocity of a rigid body. By considering the differential form of the
observations, we are led to bilinear equations that differ from those
of Lo [13] and those considered in [6]-[9] in a most significant way:
our equations are neither left- nor right-invariant (unless the
underlying Lie group is abelian, in which case our results are essentially
the same as Lo's). In this case, we cannot use the same trick that
was so successful in [6]-[9], [11], and [13], but motivated by the
results in [10], [12], we are able to obtain nonlinear finite dimensional
optimal estimation-detection equations that are most interesting in
that they include a Kalman-Bucy filter whose gain must be computed
on-line, using the incoming values of the observation process in the
integration of the associated Riccati equation.

In the next section we define several classes of processes on
Lie groups, introduce our observation model, and compare it to the
right-invariant bilinear model used in [6]-[9], [11], and [13]. Section
III contains the derivation of the likelihood ratio for signal detection
in multiplicative noise, and we also display the optimal nonlinear
signal estimation equations. Several examples are included in Section
IV. These include a problem formulation that may prove to be of value
in detecting actuator and sensor failures in rigid body inertial
guidance systems.

II. Signal and Observation Processes on Matrix Lie Groups

As in [6]-[14], the basis for our generating random processes
on matrix Lie groups is an injection procedure from the Lie algebra
associated with the Lie group into the Lie group. This type of
operation was first introduced by McKean [18], [19] and later extended
by Willsky and Lo [6]-[14]. Let G be an n-dimensional matrix Lie group
of N x N matrices with associated matrix Lie algebra L (for the
relevant properties of matrix Lie groups, see [14]-[17]). Let $A_1, \ldots, A_n$
be a basis for L. Suppose we have an n-dimensional stochastic process
x satisfying

$$ dx(t) = f(x(t),t)\,dt + G(x(t),t)\,dw(t); \qquad x(0) \text{ given} \tag{1} $$

where w is an m-dimensional Brownian motion process, independent of
x(0), with

$$ E[dw(t)\,dw'(t)] = Q(t)\,dt \tag{2} $$

Following [6], [13], [18], [19], we inject x into G via the "product
integral" in one of two ways:

$$ X_1(t) = \prod_{s<t} \exp\Big[ \sum_{i=1}^{n} A_i\,dx_i(s) \Big] \tag{3} $$

$$ X_2(t) = \prod_{s<t} \exp\Big[ \sum_{i=1}^{n} A_i\,x_i(s)\,ds \Big] \tag{4} $$

For the definition of the product integral and a discussion of
the existence and properties of $X_1$ and $X_2$, see [13], [14], [19]. We only
note that $X_1$ and $X_2$ satisfy the stochastic differential equations

$$ dX_1(t) = \Big\{ \sum_{i=1}^{n} A_i\,dx_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} [G(x(t),t)Q(t)G'(x(t),t)]_{ij} A_i A_j\,dt \Big\} X_1(t) \tag{5} $$

$$ dX_2(t) = \Big[ \sum_{i=1}^{n} A_i\,x_i(t)\,dt \Big] X_2(t) \tag{6} $$

(here $x_i$ is the ith element of the vector x and $D_{ij}$ denotes the ijth
element of the matrix D). In (5) and (6) we see the inherently bilinear
nature of these equations. In fact, they define right-invariant bilinear
systems. By reversing the order of the products in the discrete
approximation to the product integral, we obtain left-invariant bilinear
systems (e.g. $dX_2(t) = X_2(t)\big[\sum_{i=1}^{n} A_i x_i(t)\,dt\big]$).

As discussed in [10], [11], [14] and proven in [13], if we
assume that x(0) is known, the processes x, $X_1$, and $X_2$ are (almost
surely) causally equivalent, i.e. knowledge of $x^t = \{x(s) \mid 0 \le s \le t\}$
is equivalent to knowledge of $X_1^t$ or $X_2^t$. Intuitively, this is clear,
since we can write ($X_1, X_2 \in G$ almost surely, which implies they are
invertible a.s.)

$$ \sum_{i=1}^{n} A_i\,dx_i(t) = [dX_1(t)]X_1^{-1}(t) - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} [G(x(t),t)Q(t)G'(x(t),t)]_{ij} A_i A_j\,dt \tag{7} $$

$$ \sum_{i=1}^{n} A_i\,x_i(t)\,dt = [dX_2(t)]X_2^{-1}(t) \tag{8} $$

and we can recover x from $X_1$ (assuming we can solve (7)) or $X_2$ because of
the linear independence of the $A_i$ (see [10], [13] for the details).
Thus we have the equivalence of the vector space process x and the Lie
group processes $X_1$ and $X_2$ in that, given any of them, we can construct
the others, although the functional relationships among $X_1$, $X_2$, and x
are, in general, quite complex (see [10], [14]).

We include the above general formulation to indicate the extent 
of the relationship between Lie group and vector space (Lie algebra) 
processes. For some further comments on and results for the general 
formulation, we refer the reader to [6], [13], and [14].

The value of the bijectivity of the algebra-to-group injection
procedure is great, especially for the special class of linear-bilinear
processes, a setting in which we can solve detection and some
estimation problems. In order to indicate this value, we will review a
linear-bilinear problem formulation considered by Lo [13] (see also
[6]-[12]). The extension of these techniques to the nonlinear case will
be clear, although the general problem does not lead to finite
dimensional solutions.

Let x be a k-dimensional process satisfying

$$ dx(t) = F(t)x(t)\,dt + G(t)\,dw(t) \tag{9} $$

where w is an m-dimensional Brownian motion independent of the normally
distributed initial condition with

$$ E[x(0)] = 0, \qquad E[x(0)x'(0)] = P_0 \tag{10} $$

$$ E[w(t)] = 0, \qquad E[dw(t)\,dw'(t)] = Q(t)\,dt \tag{11} $$

Let C(t) be an n x k matrix of continuous functions. We now write down
a pair of hypotheses on the Lie group G:

$$ H_{1G}: \quad Z(t) = \prod_{s<t} \exp\Big\{ \sum_{i=1}^{n} A_i\,[C(s)x(s)\,ds + dv(s)]_i \Big\} \tag{12} $$

$$ H_{0G}: \quad Z(t) = \prod_{s<t} \exp\Big\{ \sum_{i=1}^{n} A_i\,dv_i(s) \Big\} \tag{13} $$

where v is an n-dimensional Brownian motion, independent of w and x(0),
with

$$ E[v(t)] = 0, \qquad E[dv(t)\,dv'(t)] = R(t)\,dt \tag{14} $$

$$ R(t) > 0 \tag{15} $$

or, in differential form,

$$ H_{1G}: \quad dZ(t) = \Big\{ \sum_{i=1}^{n} A_i\,[C(t)x(t)\,dt + dv(t)]_i + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \Big\} Z(t) \tag{16} $$

$$ H_{0G}: \quad dZ(t) = \Big\{ \sum_{i=1}^{n} A_i\,dv_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \Big\} Z(t) \tag{17} $$


Using the bijective property relating processes on the Lie
algebra and the Lie group, we have the completely equivalent hypotheses on
the Lie algebra L:

$$ H_{1L}: \quad dz(t) = C(t)x(t)\,dt + dv(t) \tag{18} $$

$$ H_{0L}: \quad dz(t) = dv(t) \tag{19} $$

Note that if we identify the pair of hypotheses $H_{1G}$ and $H_{1L}$ and the
pair $H_{0G}$ and $H_{0L}$, we have

$$ Z(t) = \prod_{s<t} \exp\Big[ \sum_{i=1}^{n} A_i\,dz_i(s) \Big] \tag{20} $$

For the Lie algebra problem, we have the standard linear-Gaussian
estimation-detection result. That is, let [0, T] be the time
interval of interest and let $C_n$ be the space of n-dimensional continuous
functions on [0, T] that are zero at t = 0. Also, let $\mathcal{B}_n$ be the Borel
field on $C_n$ under the uniform topology and let $\mu_1$ and $\mu_0$ be the measures
induced by z on $C_n$ under $H_{1L}$ and $H_{0L}$, respectively. We then
have [1]-[3] that the likelihood ratio for signal detection is

$$ \frac{d\mu_1}{d\mu_0}(z) = \exp\Big\{ -\frac{1}{2}\int_0^T \hat{x}'(t|t)C'(t)R^{-1}(t)C(t)\hat{x}(t|t)\,dt + \int_0^T \hat{x}'(t|t)C'(t)R^{-1}(t)\,dz(t) \Big\} \tag{21} $$

where the second integral is an Ito integral and

$$ \hat{x}(t|t) = E[x(t) \mid z^t, H_{1L}] \tag{22} $$

This is computed by the following Kalman-Bucy filter [4], [5]:

$$ d\hat{x}(t|t) = F(t)\hat{x}(t|t)\,dt + P(t)C'(t)R^{-1}(t)\,[dz(t) - C(t)\hat{x}(t|t)\,dt] \tag{23} $$

$$ \dot{P}(t) = F(t)P(t) + P(t)F'(t) - P(t)C'(t)R^{-1}(t)C(t)P(t) + G(t)Q(t)G'(t) \tag{24} $$

$$ P(0) = P_0 \tag{25} $$
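
The key feature of (23)-(25) is that the Riccati equation (24) does not involve the observations, so P(t) and the gain can be computed ahead of time. The following is a minimal sketch of this, assuming a scalar state and observation, constant coefficients, and a simple Euler discretization with step dt; the function names and the discretization are illustrative assumptions, not part of the original development.

```python
def riccati_path(F, C, R, GQG, P0, dt, steps):
    """Euler-integrate the scalar form of the Riccati equation (24).

    No observations are needed, so this can run before any data arrive.
    GQG stands for the scalar value of G Q G'.
    """
    P = [P0]
    for _ in range(steps):
        p = P[-1]
        p_dot = 2.0 * F * p - (p * C) ** 2 / R + GQG
        P.append(p + p_dot * dt)
    return P


def kalman_bucy(dz, P, F, C, R, dt):
    """Euler-integrate the scalar form of the filter (23) with precomputed P."""
    xhat = [0.0]  # E[x(0)] = 0 as in (10)
    for k, dz_k in enumerate(dz):
        gain = P[k] * C / R                      # P C' R^{-1}
        innovation = dz_k - C * xhat[-1] * dt    # dz - C xhat dt
        xhat.append(xhat[-1] + F * xhat[-1] * dt + gain * innovation)
    return xhat
```

A typical use would call `riccati_path` once and then feed the observation increments to `kalman_bucy`; the contrast with the multiplicative noise case treated later is that there the covariance cannot be tabulated in advance.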

Using the bijectivity of the injection procedure, we would expect the
following, which is in fact proven in [13]: let $C_G$ be the family of
continuous matrix-valued functions on [0,T] with values in G that are
equal to I at 0, and let $\mathcal{B}_G$ be the Borel field on $C_G$ under the uniform
topology. Denoting by $\nu_1$ and $\nu_0$ the measures induced on $C_G$ by
the process Z under the hypotheses $H_{1G}$ and $H_{0G}$, respectively, we have
the following representation of the likelihood ratio

$$ LR = \frac{d\nu_1}{d\nu_0}(Z) = \exp\Big\{ -\frac{1}{2}\int_0^T \hat{x}'(t|t)C'(t)R^{-1}(t)C(t)\hat{x}(t|t)\,dt + \int_0^T \hat{x}'(t|t)C'(t)R^{-1}(t)\,dz(t) \Big\} \tag{26} $$

where

$$ \hat{x}(t|t) = E[x(t) \mid Z^t, H_{1G}] = E[x(t) \mid z^t, H_{1L}] \quad \text{a.s.} \tag{27} $$

Here $\hat{x}(t|t)$ is computed as follows:

$$ d\hat{x}(t|t) = F(t)\hat{x}(t|t)\,dt + P(t)C'(t)R^{-1}(t)\,[dz(t) - C(t)\hat{x}(t|t)\,dt] \tag{28} $$


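where P is given by (24), (25) and we recover dz from Z and dZ from

$$ \sum_{i=1}^{n} A_i\,dz_i(t) = [dZ(t)]Z^{-1}(t) - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \tag{29} $$

Since $A_1, \ldots, A_n$ form a basis for L, we can write

$$ B = \sum_{i=1}^{n} b_i(B) A_i, \qquad \forall B \in L, \quad b_i(B) \in \mathbb{R} \tag{30} $$

(see [13] for details) and thus

$$ dz'(t) = \Big[ b_1\Big( \sum_{i=1}^{n} A_i\,dz_i(t) \Big), \ldots, b_n\Big( \sum_{i=1}^{n} A_i\,dz_i(t) \Big) \Big] \tag{31} $$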
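
To make the coordinate functionals of (30) concrete, the following sketch works them out for one particular algebra. It assumes the Lie algebra so(3) of 3 x 3 skew-symmetric matrices with the usual basis (an assumption made here only for illustration); the names `from_coords` and `coords` are hypothetical.

```python
# Usual basis of so(3), assumed here for illustration.
A = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]


def from_coords(b):
    """Form the algebra element B = sum_i b_i A_i."""
    return [
        [sum(b[i] * A[i][r][c] for i in range(3)) for c in range(3)]
        for r in range(3)
    ]


def coords(B):
    """Coordinate functionals b_1(B), b_2(B), b_3(B) of (30) for this basis.

    Each basis matrix has a single +1 below the diagonal, so each
    coordinate can be read off from one entry of B.
    """
    return [B[2][1], B[0][2], B[1][0]]
```

For a general basis the same step amounts to solving a small linear system, which is the "simple linear algebra" role of the preprocessor that recovers dz from the matrix increment in (29).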

Thus, for this problem, the optimal estimation-detection system
consists of a nonlinear preprocessor to recover dz from Z and dZ, followed
by a Kalman-Bucy linear filter to compute $\hat{x}(t|t)$. This is followed by
a system that takes $\hat{x}(t|t)$ and dz(t) as inputs and computes the LR from
(21) or (26). Note that, as mentioned earlier, it is crucial in the
above development that dZ have the right-invariant representation (16)
or (17). In fact, it is precisely this point that leads to the design
of the nonlinear preprocessor followed by a linear filter with
precomputable gains. We note that assuming that $x(0) \neq 0$ or that the Lie
algebra hypotheses are

$$ H_{1L}: \quad dz(t) = f(t)\,dt + C(t)x(t)\,dt + dv(t) \tag{32} $$

$$ H_{0L}: \quad dz(t) = f(t)\,dt + dv(t) \tag{33} $$

where f is a deterministic term, causes no difficulty in the above
analysis nor in the analysis described in the rest of the paper. Such
a term can be thought of as a "carrier frequency" (see [6], [8]).

As discussed in [10] and [12], there are several physically
important problems, including some inertial navigation and optical
communication applications, in which the observation noise process is
inherently multiplicative in nature. Based on this physical motivation
and the results in [10] and [12], in the next section we formulate a
multiplicative noise detection problem. As we shall see, this
development will lead to bilinear equations that are neither right- nor
left-invariant and for which the optimal detection-estimation system takes
a rather striking form. The techniques introduced here are potentially
useful in such problems as sensor, actuator, and plant failure
detection in linear and bilinear systems (which will be discussed in
subsequent papers; see also Example 1 in Section IV).

Let x, v, and C be as before, and let

$$ y(t) = C(t)x(t) \tag{34} $$

We inject y into G via the usual product integral

$$ Y(t) = \prod_{s<t} \exp\Big[ \sum_{i=1}^{n} A_i\,y_i(s)\,ds \Big] \tag{35} $$

$$ dY(t) = \Big[ \sum_{i=1}^{n} A_i\,y_i(t)\,dt \Big] Y(t) = \Big[ \sum_{i=1}^{n} A_i\,(C(t)x(t))_i\,dt \Big] Y(t) \tag{36} $$


We also inject v into G via a second product integral

$$ V(t) = \prod_{s<t} \exp\Big[ \sum_{i=1}^{n} A_i\,dv_i(s) \Big] \tag{37} $$

which is to be interpreted as corresponding to the left-invariant
bilinear stochastic equation

$$ dV(t) = V(t)\Big[ \sum_{i=1}^{n} A_i\,dv_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \Big] \tag{38} $$

We note that (37) corresponds to reversing the order of the products in
the limiting expression for the product integral (see [10], [12]).
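
The difference between the two orderings of the product integral can be seen in a discrete approximation. The sketch below assumes the group SO(3) with the usual so(3) basis, so that each factor can be evaluated in closed form by Rodrigues' formula; the function names and the `side` argument are hypothetical and serve only to illustrate the ordering.

```python
import math


def hat(w):
    """Map a 3-vector w to the skew-symmetric matrix sum_i A_i w_i."""
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]


def matmul(X, Y):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(X[r][k] * Y[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def expm_so3(w):
    """Matrix exponential of hat(w) via Rodrigues' formula."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-12:
        a, b = 1.0, 0.5  # small-angle limits of the two coefficients
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if r == c else 0.0) + a * K[r][c] + b * K2[r][c]
             for c in range(3)] for r in range(3)]


def product_integral(increments, side):
    """Discrete product integral of a list of 3-vector increments.

    side == "right-invariant": each new factor multiplies on the left,
    matching dY = [...] Y as in (36).
    side == "left-invariant": each new factor multiplies on the right,
    matching dV = V [...] as in (38).
    """
    X = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    for w in increments:
        E = expm_so3(w)
        X = matmul(E, X) if side == "right-invariant" else matmul(X, E)
    return X
```

When all increments lie along one axis the factors commute and the two orderings agree; for increments about different axes they generally do not, which is the nonabelian effect the observation model below depends on.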

In the next section we will consider a detection problem
involving an observation process of the form

$$ M(t) = Y(t)V(t) \tag{39} $$

Here Y is the signal process and V should be interpreted as observation
noise. For physical motivation for the models (38) and (39), see [10],
[12]. The differential form of (39) is

$$ dM(t) = \Big[ \sum_{i=1}^{n} A_i\,[C(t)x(t)]_i\,dt \Big] M(t) + M(t)\Big[ \sum_{i=1}^{n} A_i\,dv_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \Big] \tag{40} $$


This is neither left- nor right-invariant (in fact, it is the sum of a
left-invariant and a right-invariant term), and thus we cannot use the
same nonlinear preprocessing trick that was successful in the previous
problem in reducing the problem to a linear-Gaussian one. That is, in
general, when we multiply through in (40) by $M^{-1}(t)$, the right-hand
side is not independent of M, so we cannot obtain the filter form of
nonlinear preprocessor followed by a linear filter with precomputed
gains. As we shall see in the next section, the optimum filter-detector
is highly nonlinear in nature and possesses a rather distinctive form.

In closing this section, we note that the observation process
(40) is of the same form as that in (12) if the underlying Lie group
G is abelian [15]. In this case, elements of L and G commute, which
implies that right- and left-invariant bilinear systems are the same.
Thus, our results will reduce to those of Lo [13] and Willsky and Lo
[6]-[8] in the abelian case.

III. Estimation-Detection with Multiplicative Observation Noise

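Let Y and V be given by (35)-(38). We define two hypotheses on G

$$ H_{1G}: \quad M(t) = Y(t)V(t) \tag{41} $$

$$ H_{0G}: \quad M(t) = V(t) \tag{42} $$

or, in differential form,

$$ H_{1G}: \quad dM(t) = \Big[ \sum_{i=1}^{n} A_i\,y_i(t)\,dt \Big] M(t) + M(t)\Big[ \sum_{i=1}^{n} A_i\,dv_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \Big] \tag{43} $$

$$ H_{0G}: \quad dM(t) = M(t)\Big[ \sum_{i=1}^{n} A_i\,dv_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \Big] \tag{44} $$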

The problem is to determine the likelihood ratio for these two hypotheses 
and to display the associated filtering equations that arise. 

As in the right-invariant case discussed in the preceding section,
we will find it useful to transform the hypotheses (41), (43) and (42),
(44) into completely equivalent hypotheses on the Lie algebra via a
particular bijective mapping. We do this as follows: multiply both
sides of (43) and (44) on the left by $M^{-1}(t)$ (which exists w.p. 1).
Recall [15], [16] that if L is the matrix Lie algebra associated with the
matrix Lie group G, then

$$ X^{-1} A X \in L \qquad \forall A \in L, \quad \forall X \in G \tag{45} $$

Thus, we have the hypotheses on the Lie algebra L (almost surely):

$$ H_{1L}: \quad M^{-1}(t)[dM(t)] - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt = M^{-1}(t)\Big[ \sum_{i=1}^{n} A_i\,y_i(t)\,dt \Big] M(t) + \sum_{i=1}^{n} A_i\,dv_i(t) \tag{46} $$

$$ H_{0L}: \quad M^{-1}(t)[dM(t)] - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt = \sum_{i=1}^{n} A_i\,dv_i(t) \tag{47} $$

which are easily seen to be completely equivalent to $H_{1G}$ and $H_{0G}$,
respectively, since the mapping from $M^t$ to $Z^t$, where Z is defined by

$$ dZ(t) = M^{-1}(t)[dM(t)] - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} R_{ij}(t) A_i A_j\,dt \tag{48} $$

is seen to be a bijection of the same general type as that used in
[13].

To simplify the hypotheses (46), (47), we coordinatize Z using the
basis $A_1, \ldots, A_n$. That is, if we write

$$ Z(t) = \sum_{i=1}^{n} A_i\,z_i(t) \tag{49} $$

our Lie algebra hypotheses become

$$ H_{1L}: \quad dz(t) = H(M(t),t)x(t)\,dt + dv(t) \tag{50} $$

$$ H_{0L}: \quad dz(t) = dv(t) \tag{51} $$

where H(M(t),t) is an n x k matrix that depends on M(t) (it is clear
that the right-hand side of (46) is linear in x(t), since y(t) = C(t)x(t)).
This matrix can be computed as follows: write

$$ M^{-1}(t) A_i M(t) = \sum_{j=1}^{n} \gamma_{ij}(M(t)) A_j \tag{52} $$

Then

$$ \begin{aligned} M^{-1}(t)\Big[ \sum_{i=1}^{n} A_i\,y_i(t) \Big] M(t) &= \sum_{i=1}^{n}\sum_{j=1}^{n} \gamma_{ij}(M(t)) A_j\,y_i(t) \\ &= \sum_{j=1}^{n} \Big[ \sum_{i=1}^{n} \gamma_{ij}(M(t))\,y_i(t) \Big] A_j \\ &= \sum_{j=1}^{n} [H(M(t),t)x(t)]_j A_j \end{aligned} \tag{53} $$

where

$$ H(M(t),t) = \Gamma'(M(t))C(t) \tag{54} $$

and the ijth element of $\Gamma$ is $\gamma_{ij}$. Note that if G is abelian,

$$ M^{-1}(t) A_i M(t) = A_i \qquad \forall i \tag{55} $$

which implies that $\Gamma = I$ and H = C, and thus, as discussed at the end
of the preceding section, our hypotheses (50), (51) reduce to the
right-invariant hypotheses (18), (19). However, in the general case H depends
on M(t). We refer the reader to [13] for a mechanization of a procedure
for finding the coordinate functions $\gamma_{ij}$. The procedure involves simple
linear algebra and is easy to mechanize, and thus for our purposes, we
assume that we have two black boxes that can take dZ(t) and M(t) as
their respective inputs and produce as outputs dz and H(M(t),t),
respectively (see the examples in the next section in which we explicitly
display the details of these black boxes in several specific cases).
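
As an illustration of the second black box, the following sketch computes the coordinate functions of (52) and the matrix H of (54). It assumes G = SO(3) with the usual so(3) basis, so that the inverse of M is its transpose; the helper names are hypothetical, and the procedure of [13] for a general group is not reproduced here.

```python
# Usual basis of so(3), assumed here for illustration.
A = [
    [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
    [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]


def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]


def transpose(X):
    return [[X[r][c] for r in range(len(X))] for c in range(len(X[0]))]


def coords(B):
    """Coordinates of a skew-symmetric B with respect to the basis A."""
    return [B[2][1], B[0][2], B[1][0]]


def gamma(M):
    """Matrix Gamma with entries gamma_ij(M) defined by (52).

    Row i holds the coordinates of M^{-1} A_i M; for M in SO(3) the
    inverse is the transpose.
    """
    Mt = transpose(M)
    return [coords(matmul(matmul(Mt, A[i]), M)) for i in range(3)]


def observation_matrix(M, C):
    """H(M) = Gamma'(M) C as in (54); C is a 3 x k nested list."""
    return matmul(transpose(gamma(M)), C)
```

For this particular group and basis the computation returns Gamma(M) = M, so that H(M) = M'C; in the abelian case the same routine would return the identity, consistent with (55).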

By the bijectivity-equivalence arguments discussed earlier, our
detection problem now reduces to considering the two hypotheses (50),
(51) with the knowledge that $z^t$ and $M^t$ are causally equivalent (this
is true because $Z^t$ and $M^t$ are, and z is simply a particular
coordinatization of Z). Thus, at time t, M(t) and therefore H(M(t),t) are
conditionally known. Intuitively, we then see that (50) represents a
"conditionally" linear observation process. Thus one would expect the
optimal filter for x(t) given $z^t$ to be a Kalman-Bucy filter with the
optimal gains and associated covariance computed on line using incoming
values of M or z. In addition, it would then be natural to expect the
likelihood ratio to have a form very similar to the ones in the previous
section but again incorporating incoming values of M or z.

These intuitive ideas are, in fact, correct, and a discussion
of how this result follows from several results in the literature is
presented in the appendix. Using the equations in the appendix and
noting that, since M(t) is a deterministic function of $z^t$, we have
$H(M(t),t) = H(z^t,t)$, we can derive the optimal estimation-detection
system depicted in Figure 1. The incoming observation dM(t) is
integrated, and the value of M(t), along with the known values of C(t) and
R(t), is used to compute H(M(t),t) and dz(t) as described earlier. The
conditional density for x(t) given $z^t$ is the normal density
$N(x; \hat{x}(t|t), P(t|t))$, where both the conditional mean $\hat{x}(t|t)$ and the conditional
covariance P(t|t) are computed on-line using incoming values of M and
z in the integration of the equations:

$$ d\hat{x}(t|t) = F(t)\hat{x}(t|t)\,dt + K(t|t)\,[dz(t) - H(M(t),t)\hat{x}(t|t)\,dt] \tag{56} $$

$$ \hat{x}(0|0) = 0 \tag{57} $$

$$ \dot{P}(t|t) = F(t)P(t|t) + P(t|t)F'(t) + G(t)Q(t)G'(t) - K(t|t)R(t)K'(t|t) \tag{58} $$

$$ P(0|0) = P_0 \tag{59} $$

$$ K(t|t) = P(t|t)H'(M(t),t)R^{-1}(t) \tag{60} $$

Fig. 1: Illustrating the Optimal Estimation-Detection System for Multiplicative Observation Noise

The likelihood ratio LR(t|t) for hypothesis $H_1$ over $H_0$, given observations
up to time t, is then given by

$$ LR(t|t) = \exp\Big\{ -\frac{1}{2}\int_0^t \hat{x}'(s|s)H'(M(s),s)R^{-1}(s)H(M(s),s)\hat{x}(s|s)\,ds + \int_0^t \hat{x}'(s|s)H'(M(s),s)R^{-1}(s)\,dz(s) \Big\} \tag{61} $$

(here $(H_1, H_0)$ can be thought of as either $(H_{1G}, H_{0G})$ or $(H_{1L}, H_{0L})$, since
they are equivalent).
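
The structure of (56)-(61) can be sketched as a single update step in which the observation-dependent matrix H enters the gain, the Riccati equation, and the likelihood ratio. The sketch assumes a scalar state and observation and an Euler discretization with step dt; the value H is taken as already supplied by the preprocessor, and the function name is hypothetical.

```python
def filter_detector_step(xhat, P, log_lr, dz, H, F, GQG, R, dt):
    """One Euler step of the scalar forms of (56), (58), (60), and log of (61).

    H is the current value of H(M(t), t), recomputed from the incoming
    observation at every step, so P cannot be tabulated in advance.
    GQG stands for the scalar value of G Q G'.
    """
    K = P * H / R  # gain (60)

    # Increment of the log likelihood ratio; the integrand uses the
    # estimate before the update, as required for an Ito integral.
    log_lr += -0.5 * (H * xhat) ** 2 / R * dt + xhat * H / R * dz

    innovation = dz - H * xhat * dt
    xhat_new = xhat + F * xhat * dt + K * innovation       # (56)
    P_new = P + (2.0 * F * P + GQG - K * R * K) * dt       # (58)
    return xhat_new, P_new, log_lr
```

In a full implementation the same step would be applied with matrix-valued quantities, and the sequence of H values would come from the group-valued observation M(t) through (52) and (54).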

The optimal estimation-detection system is quite distinctive in 
form, as it should be viewed as an optimal linear estimation system, 
augmented by the on-line integration of the nonlinear Riccati equation 
using the incoming values of the observations, followed by a likelihood 
ratio evaluation that again is identical in form to the usual linear- 
Gaussian one, but that also incorporates new values of the observations 
into the gains. It should also be noted that a simple example of a
discrete time system for which the optimal filter has this type of form
was reported by Åström [20, p. 236].

We note that, as in [1], by using the chain rule one can readily
extend these likelihood ratio results to problems such as detection in
colored as well as white noise. For instance, we can consider the case
in which Y(t) may be generated in one of two hypothesized ways:

$$ H_0: \quad dY(t) = \Big( \sum_{i=1}^{n} A_i\,[C_1(t)x(t)]_i\,dt \Big) Y(t) \tag{62} $$

with x a $k_1$-vector satisfying

$$ dx(t) = F_1(t)x(t)\,dt + G_1(t)\,dw(t) \tag{63} $$

where $F_1$, $G_1$, and $C_1$ are given matrix functions of appropriate dimensions.
The second hypothesis is

$$ H_1: \quad dY(t) = \Big( \sum_{i=1}^{n} A_i\,[C_2(t)\xi(t)]_i\,dt \Big) Y(t) \tag{64} $$

with $\xi$ a $k_2$-vector satisfying

$$ d\xi(t) = F_2(t)\xi(t)\,dt + G_2(t)\,dw(t) \tag{65} $$

where $F_2$, $G_2$, and $C_2$ are also given matrix functions.

In this case the likelihood ratio for $H_1$ and $H_0$ given the
observation process

$$ M(t) = Y(t)V(t) \tag{66} $$

is obtained as the ratio of the LR for $H_1$ and $H_2$ and the LR for $H_0$
and $H_2$, where $H_2$ is the hypothesis

$$ H_2: \quad Y(t) = I \tag{67} $$

Thus, it is easy to see that the system that computes the desired LR
consists of two linear filters with on-line gain computations, i.e.
one filter estimating x assuming $H_0$ holds and one estimating $\xi$ assuming
$H_1$ holds.

It is clear that there are a number of variations on this theme,
e.g. we can hypothesize different V processes, etc. Thus, we see that
the techniques developed in this section are potentially useful in
identifying the underlying dynamics of the system under consideration
and in detecting abrupt changes in the dynamics. Example 1 in the next
section is a simplified version of a very important practical problem,
and it indicates how our results may be applied.

We now make a few comments on optimal Lie group estimation. The
likelihood ratio formula (61) explicitly uses only the optimal (least
squares, maximum likelihood, etc.) estimate of the vector-valued quantity
x(t). Referring to the definitions of y (34) and Y (35), (36), we see
that we can directly compute the optimal estimate of y(t):

$$ \hat{y}(t|t) = C(t)\hat{x}(t|t) \tag{68} $$

By identifying $\mathbb{R}^n$ with L via

$$ u \leftrightarrow \sum_{i=1}^{n} u_i A_i \tag{69} $$

we see that we are essentially computing optimal estimates of processes,
such as y, on the Lie algebra. What about the estimation of a Lie
group-valued process such as Y? This is, in general, a very difficult
(in fact, unsolved) problem, since there is no simple relationship
between $Y^t$ and $y^t$. In fact, the optimal estimate of Y(t) would seem
in general to require smoothing our estimates of the entire trajectory
$y^t$, and even having this it is not clear what to do! These difficulties
do not arise in the abelian case, which is studied in [6]-[9] and [14],
and explicit solutions can be found in other special cases. A first
result is reported in [10] and [11], and further results will be
presented in later papers.

Finally, in closing this section, we make a few comments about
a slight generalization of the results of this section. So far we
have assumed that $A_1, \ldots, A_n$ form a basis for L. Suppose instead we
simply assume that $A_1, \ldots, A_n$ are linearly independent and generate L,
which we assume is p (> n)-dimensional. Find $A_{n+1}, \ldots, A_p$ so that
$A_1, \ldots, A_p$ form a basis for L. In this case, we must replace the
coordinatization in (49) by

$$ Z(t) = \sum_{i=1}^{p} A_i\,z_i(t) \tag{70} $$

and the Lie algebra hypotheses (50), (51) become

$$ H_{1L}: \quad dz(t) = H(M(t),t)x(t)\,dt + S(t)\,dv(t) \tag{71} $$

$$ H_{0L}: \quad dz(t) = S(t)\,dv(t) \tag{72} $$

where H(M(t),t) is now a p x k matrix (computed in precisely the same
fashion) and S(t) is the p x n matrix given by

$$ S(t) = \begin{bmatrix} I_n \\ 0_{(p-n) \times n} \end{bmatrix} \tag{73} $$

In this case (71), (72) include several perfect observations, and the
optimal estimation system becomes an observer-estimator [25], [26]. Thus,
there are no conceptual difficulties introduced by this generalization.

IV. Examples 

In this section we will present two examples illustrating the 
techniques developed in the preceding section. 

Example 1: Consider the Lie group SO(3), consisting of all 3 x 3
orthogonal matrices with positive determinant. Such a matrix can be
thought of as representing the orientation of a rigid body in $\mathbb{R}^3$,
i.e. it is a "direction cosine" matrix [27] representing the
orientation of an object with respect to a (possibly inertial) reference frame.
The Lie algebra so(3) associated with SO(3) has the basis

$$ A_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad A_2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \qquad A_3 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \tag{74} $$

Note that SO(3) is nonabelian. For further discussions of the properties
and physical significance of SO(3), we refer the reader to [10], [12], [14],
[27]-[29].

We now suppose that we have a stochastic process x in $\mathbb{R}^3$ that
is given by one of the two hypotheses

$$ H_0: \quad dx(t) = f(t)\,dt + dw(t) \tag{75} $$

$$ H_1: \quad dx(t) = \xi\,dt + f(t)\,dt + dw(t) \tag{76} $$

where f(t) is a deterministic 3-dimensional time function, and $\xi$ is a
random (constant) vector with normal distribution

$$ E(\xi) = 0, \qquad E(\xi\xi') = P \tag{77} $$
Also, x(0) is a normally distributed random vector, independent of $\xi$,
with

$$ E[x(0)] = 0, \qquad E[x(0)x'(0)] = P_0 \tag{78} $$

and w is a three-dimensional Brownian motion process, independent of
$\xi$ and x(0), with

$$ E[w(t)] = 0, \qquad E[dw(t)\,dw'(t)] = Q(t)\,dt \tag{79} $$

The x process is injected into SO(3) via the equation

$$ dX(t) = \Big[ \sum_{i=1}^{3} A_i\,x_i(t)\,dt \Big] X(t) \tag{80} $$
If we think of X as the direction cosine matrix of a rigid body, x 
has the physical interpretation of being an angular velocity vector, 
representing the angular velocity of the rigid body with respect to 
a reference frame (the coordinatization of these quantities depends upon 
the particular application; see [27] for details) . In this case we can 
interpret physically the two hypotheses: the term f(t) represents known
torques that we apply to the body and the Brownian motion term
represents random disturbances. The random term $\xi$ represents a possible
actuator failure in the control system of the craft, e.g. a jammed
reactor jet on a spacecraft or a failed control surface on an aircraft.
Thus, the problem of distinguishing between these hypotheses can be
viewed as a failure detection problem.

Before discussing the relevant observation process and associated
detection system, we comment on the above dynamical model. Note that
the angular velocity equations we have postulated are simpler than the
usual nonlinear Euler equations [27]. Equations (75) and (76) or
somewhat more complicated linear equations can be viewed as reasonable
approximations if: (1) the rigid body is "nearly" spherically
symmetric; or (2) we linearize Euler's equations about a nominal (which
might be included in the f(t) term); or (3) we make Q(t) large enough
so that the nonlinear effects can be viewed as process noise. Also,
there is no difficulty in considering more general linear dynamics,
e.g. if we linearize about a nominal, or if $\xi$ is taken to be
time-varying.

We now describe the observation process of interest to us. We 
assume that X represents the relative orientation of the body with 
respect to inertial space, and we suppose that the rigid body is 
equipped with an inertial platform that is to be kept fixed in inertial 
space (see [27] for a detailed discussion) . Because of drifts in the 
gyroscopes used to sense rotation of the rigid body, the platform 
drifts relative to inertial space. As discussed in [12] and [14], a
possible model for this drift is to take V(t), the orientation of
inertial space with respect to the platform, to be a left-invariant
Brownian motion

$$ dV(t) = V(t)\Big\{ \sum_{i=1}^{3} A_i\,dv_i(t) + \frac{1}{2}\sum_{i,j=1}^{3} R_{ij}(t) A_i A_j\,dt \Big\} \tag{81} $$

where v is a 3-dimensional Brownian motion, independent of x(0), $\xi$,
and w, with

$$ E[v(t)] = 0, \qquad E[dv(t)\,dv'(t)] = R(t)\,dt \tag{82} $$

$$ R(t) > 0 \tag{83} $$

Our observation process is the orientation M(t) of the rigid body with
respect to the platform, which can be determined by reading off gimbal
angles and is given by

$$ M(t) = X(t)V(t) \tag{84} $$

As discussed in the preceding section, the incremental change in M(t)
is given by

$$ dM(t) = \Big\{ \sum_{i=1}^{3} A_i\,x_i(t) \Big\} M(t)\,dt + M(t)\Big\{ \sum_{i=1}^{3} A_i\,dv_i(t) + \frac{1}{2}\sum_{i,j=1}^{3} R_{ij}(t) A_i A_j\,dt \Big\} \tag{85} $$

(see [12], [14], and [27] for a discussion of how one obtains pulse-like
or incremental information in such systems).

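Performing the type of transformation used in the previous section
(note that $M^{-1}(t) = M'(t)$ a.s.), we have

$$ dZ(t) = M'(t)\,dM(t) - \frac{1}{2}\sum_{i,j=1}^{3} R_{ij}(t) A_i A_j\,dt = M'(t)\Big[ \sum_{i=1}^{3} A_i\,x_i(t) \Big] M(t)\,dt + \sum_{i=1}^{3} A_i\,dv_i(t) \tag{86} $$

Also, we obtain the following expression for z(t), the $\mathbb{R}^3$-coordinatization
of Z, and its differential:

$$ z'(t) = [Z_{32}(t),\; Z_{13}(t),\; Z_{21}(t)] \tag{87} $$

$$ dz(t) = H(M(t))x(t)\,dt + dv(t) \tag{88} $$

where, since $M'(t) A_i M(t) = \sum_{j=1}^{3} M_{ij}(t) A_j$ for the basis (74), so that $\Gamma(M) = M$ in (54),

$$ H(M(t)) = M'(t) \tag{89} $$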
Having these expressions, we have the following equation for the like- 
lihood ratio for the two hypotheses: 


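LR(t) = \frac{ \exp\left\{ -\frac{1}{2} \int_0^t \hat{x}_1'(s|s) H'(M(s)) R^{-1}(s) H(M(s)) \hat{x}_1(s|s) \, ds 
                           + \int_0^t \hat{x}_1'(s|s) H'(M(s)) R^{-1}(s) \, dz(s) \right\} }
             { \exp\left\{ -\frac{1}{2} \int_0^t \hat{x}_0'(s|s) H'(M(s)) R^{-1}(s) H(M(s)) \hat{x}_0(s|s) \, ds 
                           + \int_0^t \hat{x}_0'(s|s) H'(M(s)) R^{-1}(s) \, dz(s) \right\} }    (90)
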


where \hat{x}_1(t|t) is the conditional mean of x(t) given z_t, assuming H_1 
holds, while to compute \hat{x}_0(t|t), we assume H_0 holds. The stochastic 
differential equations for these quantities are 


d\hat{x}_0(t|t) = f(t) \, dt + K_0(t|t) [dz(t) - H(M(t)) \hat{x}_0(t|t) \, dt]    (91)

\dot{P}_0(t|t) = Q(t) - K_0(t|t) R(t) K_0'(t|t)    (92)

K_0(t|t) = P_0(t|t) H'(M(t)) R^{-1}(t)    (93)

d \begin{bmatrix} \hat{x}_1(t|t) \\ \hat{\xi}(t|t) \end{bmatrix} 
  = \begin{bmatrix} f(t) + \hat{\xi}(t|t) \\ 0 \end{bmatrix} dt 
    + K_1(t|t) [dz(t) - H(M(t)) \hat{x}_1(t|t) \, dt]    (94)

\dot{P}_1(t|t) = F P_1(t|t) + P_1(t|t) F' + \bar{Q}(t) - K_1(t|t) R(t) K_1'(t|t)    (95)

K_1(t|t) = P_1(t|t) \bar{H}'(M(t)) R^{-1}(t)    (96)

Here P_1 is a 6 x 6 matrix, and 

F = \begin{bmatrix} 0 & I \\ 0 & 0 \end{bmatrix},    
\bar{Q} = \begin{bmatrix} Q & 0 \\ 0 & 0 \end{bmatrix},    
\bar{H} = [H \;\; 0]    (97)



We also note that sensor failure detection can be considered 
by hypothesizing several different forms for V. 

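Example 2: Consider GL(2,\mathbb{R}), the group of 2 x 2 invertible matrices. 
Its Lie algebra consists of all 2 x 2 matrices and has the basis 

A_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},  
A_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},  
A_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},  
A_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}    (98)
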


Let x be the k-dimensional process satisfying 


dx(t) = F(t) x(t) \, dt + G(t) \, dw(t)    (99)

and let y be the 4-dimensional process 

y(t) = C(t) x(t)    (100)

We also take v to be a 4-dimensional Brownian motion independent of w with 

E(dv(t) dv'(t)) = I \, dt    (101)

We inject y and v into GL(2,\mathbb{R}) via 

dY(t) = \left[ \sum_{i=1}^{4} A_i y_i(t) \right] Y(t) \, dt    (102)

dV(t) = V(t) \left[ \sum_{i=1}^{4} A_i \, dv_i(t) + \frac{1}{2} (A_1 + A_4) \, dt \right]    (103)

and define the two hypotheses 

H_1: M(t) = Y(t) V(t)    (104)

H_0: M(t) = V(t)    (105)

As discussed in Section III, we can define Z via 

dZ(t) = M^{-1}(t) \, dM(t) - \frac{1}{2} (A_1 + A_4) \, dt    (106)

and, defining the 4-vector 

z'(t) = [Z_{11}(t), Z_{12}(t), Z_{21}(t), Z_{22}(t)]    (107)

the two hypotheses become 

H_1: dz(t) = H(M(t),t) x(t) \, dt + dv(t)    (108)

H_0: dz(t) = dv(t)    (109)

where we compute H(M(t),t) from 

H(M(t),t) = \Gamma'(M(t)) C(t)    (110)

where \gamma_{ij}, the ij element of \Gamma, is given by 

\gamma_{i1}(M) = (M^{-1} A_i M)_{11},    \gamma_{i2}(M) = (M^{-1} A_i M)_{12}, 
\gamma_{i3}(M) = (M^{-1} A_i M)_{21},    \gamma_{i4}(M) = (M^{-1} A_i M)_{22}    (111)

For instance, 

\gamma_{11}(M) = \frac{M_{11} M_{22}}{M_{11} M_{22} - M_{12} M_{21}}    (112)
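
As an illustration of how the quantities in this example could be evaluated, the following sketch computes the matrix of coefficients (M^{-1} A_i M)_{jk} for a given M in GL(2,R), forms H = Gamma' C, and compares one entry with the closed form for gamma_11. The function names, the test matrix, and the choice of C as the identity are assumptions made only for illustration.

```python
# Basis A_1, ..., A_4 of the Lie algebra of GL(2,R), as in (98).
BASIS = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]

def matmul2(a, b):
    """Product of two 2 x 2 nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(m):
    """Inverse of a 2 x 2 matrix; raises if m is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ValueError("M must be invertible")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def gamma(M):
    """4 x 4 matrix whose row i holds the (11, 12, 21, 22) entries of M^{-1} A_i M."""
    Minv = inv2(M)
    rows = []
    for A in BASIS:
        B = matmul2(matmul2(Minv, A), M)
        rows.append([B[0][0], B[0][1], B[1][0], B[1][1]])
    return rows

def observation_matrix(M, C):
    """H = Gamma'(M) C, with C given as a 4 x k nested list."""
    G = gamma(M)
    k = len(C[0])
    return [[sum(G[i][j] * C[i][c] for i in range(4)) for c in range(k)]
            for j in range(4)]

M = [[2.0, 1.0], [0.5, 3.0]]
G = gamma(M)

# Closed form for gamma_11: M11*M22 / (M11*M22 - M12*M21).
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
gamma_11_closed_form = M[0][0] * M[1][1] / det

# Sum of squared basis matrices; it equals A_1 + A_4, the matrix that
# appears in the correction term of the noise model for unit covariance.
ito_sum = [[sum(matmul2(A, A)[i][j] for A in BASIS) for j in range(2)]
           for i in range(2)]

identity4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
H = observation_matrix(M, identity4)
print(G[0][0], gamma_11_closed_form)
```

With C equal to the identity, H reduces to the transpose of the coefficient matrix, which gives a quick consistency check on the index conventions.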
Having these terms, we can apply the results of Section III to obtain 
explicit optimal estimation and likelihood ratio equations. 

V. Conclusions 

In this paper we have considered a class of optimal estimation- 
detection problems involving multiplicative observation noise. By 
considering the differential form of the observation process, we ob- 
tained optimal estimation and likelihood ratio equations that are quite 
interesting in that they are identical to those in the linear-Gaussian 
case except that the estimation error covariance depends on the 
observations and thus must be computed on-line. 

We have noted that these results are potentially useful for on-line 
system identification and in the detection of failures or changes in 
system dynamics. This potentiality was illustrated by examining an 
actuator failure detection problem associated with rigid body rotations 
and inertial navigation systems. 



APPENDIX: The Computation of a Likelihood Ratio and a Conditional Density 

In Section III we were confronted with a signal detection problem 
of the form 

H_1: dz(t) = H(z_t, t) x(t) \, dt + dv(t)    (113)

H_0: dz(t) = dv(t)    (114)

where v is an n-dimensional Brownian motion 

E[v(t)] = 0,    E[dv(t) dv'(t)] = R(t) \, dt    (115)

R(t) > 0    (116)

and x is a k-dimensional process satisfying 

dx(t) = F(t)x(t)dt + G(t)dw(t) (117) 

Here x(0) is assumed to be a Gaussian random variable with 

E[x(0)] = x_0,    E[(x(0) - x_0)(x(0) - x_0)'] = P_0    (118)

and w is an m-dimensional Brownian motion with 

E[w(t)] = 0,    E[dw(t) dw'(t)] = Q(t) \, dt    (119)


It is assumed that v, w, and x(0) are mutually completely independent. 
Also, we note that H is allowed to be a function of the past observations 
z_t, and we define the "signal" process 

s(t) = H(z_t, t) x(t)    (120)

We now note that future values of v(·) are independent of past 
values of z(·) and s(·). Also, for the particular case of interest in 
Section III, it can be shown by a tedious but straightforward calculation 
that if [0,T] is the time interval of interest, 

\int_0^T E \, |s(t)|^2 \, dt < \infty    (121)

These facts enable us to use the vector version of the likelihood 
ratio formula derived in [2]. Let \Omega_T be the space of n-dimensional 
continuous functions on [0,T] with associated Borel field \mathcal{B}_T under the 
uniform topology. Letting \mu_1 and \mu_0 denote the measures induced on 
(\Omega_T, \mathcal{B}_T) by z under H_1 and H_0, respectively, we have 

LR = \frac{d\mu_1}{d\mu_0}(z) 
   = \exp\left\{ -\frac{1}{2} \int_0^T \hat{s}'(t|t) R^{-1}(t) \hat{s}(t|t) \, dt 
                 + \int_0^T \hat{s}'(t|t) R^{-1}(t) \, dz(t) \right\}    (122)

where 

\hat{s}(t|t) = E[s(t) | z_t, H_1] = H(z_t,t) E[x(t) | z_t, H_1] = H(z_t,t) \hat{x}(t|t)    (123)
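
In discrete time the exponent of the likelihood ratio can be approximated by a running sum. The sketch below is a minimal illustration under stated assumptions: a fixed step dt, a constant R supplied through its inverse, and signal estimates already available at each step. The names `quad` and `log_likelihood_ratio` are hypothetical.

```python
def quad(u, A, v):
    """Return u' A v for list vectors u, v and a nested-list matrix A."""
    return sum(u[i] * A[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

def log_likelihood_ratio(s_hat_path, dz_path, R_inv, dt):
    """Riemann/Ito sum approximating the exponent of the likelihood ratio.

    s_hat_path[k] is the signal estimate at step k, computed from
    observations before step k; dz_path[k] is the observation increment
    over step k.
    """
    total = 0.0
    for s_hat, dz in zip(s_hat_path, dz_path):
        total += -0.5 * quad(s_hat, R_inv, s_hat) * dt + quad(s_hat, R_inv, dz)
    return total

# Two steps of a 2-dimensional observation with unit noise covariance.
R_inv = [[1.0, 0.0], [0.0, 1.0]]
s_hat_path = [[1.0, 0.0], [1.0, 0.0]]
dz_path = [[0.1, 0.3], [0.1, -0.2]]
log_lr = log_likelihood_ratio(s_hat_path, dz_path, R_inv, dt=0.1)
print(log_lr)
```

Using the estimate from the start of each step against the increment over that step mirrors the non-anticipating form of the stochastic integral; working with the logarithm avoids overflow when the sum grows.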

Thus, it remains to derive a method for computing \hat{x}(t|t). We 
first note that for any 0 < t_1 < t_2 < ... < t_r < t, the variables 
z(t_1), ..., z(t_r) and x(t) are not jointly Gaussian. However, as we shall 
see, the conditional density for x(t) given z_t is Gaussian with mean 
and covariance that depend on z_t! To see this, we refer to the work of 
Kailath [21], [22], and Frost and Kailath [23] on the innovations 
approach to least squares estimation. In particular, in [23] a partial 
differential equation for the conditional density is derived. The 
derivation assumes the complete independence of v(·) and s(·), which we 
do not have in our case. However, as Frost and Kailath [23] suggest, 
if we use the weaker innovations representation of Fujisaki, Kallianpur, 
and Kunita [24], which requires only that future values of v(·) be 
independent of past s(·) and v(·) plus the integrability condition (121) 
(actually, a weaker condition will do), we can obtain a partial 
differential equation of essentially the same form. That is, if we let 
p(x,t) denote the conditional density for x(t) given z_t evaluated at 
x, we have 

dp(x,t) = L(p)(x,t) \, dt 
          + [x - \hat{x}(t|t)]' H'(z_t,t) R^{-1}(t) [dz(t) - H(z_t,t) \hat{x}(t|t) \, dt] \, p(x,t)    (124)

where L is the Fokker-Planck operator for (117) 

L(p)(x,t) = -p(x,t) \, \mathrm{tr} \, F(t) - \frac{\partial p}{\partial x}(x,t) F(t) x 
            + \frac{1}{2} \mathrm{tr} \left[ G(t) Q(t) G'(t) \frac{\partial^2 p}{\partial x^2}(x,t) \right]    (125)


A straightforward computation shows that the solution to (124) is 

p(x,t) = N(x; \hat{x}(t|t), P(t|t))    (126)

where N(x; a, P) is the (multi-dimensional) normal density with mean a 
and covariance P, and we compute \hat{x}(t|t) and P(t|t) from 

d\hat{x}(t|t) = F(t) \hat{x}(t|t) \, dt + P(t|t) H'(z_t,t) R^{-1}(t) [dz(t) - H(z_t,t) \hat{x}(t|t) \, dt]    (127)

\hat{x}(0|0) = x_0    (128)

\dot{P}(t|t) = F(t) P(t|t) + P(t|t) F'(t) + G(t) Q(t) G'(t) 
               - P(t|t) H'(z_t,t) R^{-1}(t) H(z_t,t) P(t|t)    (129)

P(0|0) = P_0    (130)

Note that P depends on z_t. 
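
The dependence of P on the observations can be seen in a small numerical sketch. The code below applies a simple Euler discretization of (127) and (129) to a scalar state and scalar observation; the scalar setting, the Euler scheme, the gain function cos(z), and all function names are assumptions chosen for illustration, not the method of the paper.

```python
import math

def filter_step(x_hat, P, dz, H, F, GQG, R, dt):
    """One Euler step of (127) and (129) for scalar x and z."""
    innovation = dz - H * x_hat * dt
    gain = P * H / R
    x_new = x_hat + F * x_hat * dt + gain * innovation
    P_new = P + (2.0 * F * P + GQG - P * H * H * P / R) * dt
    return x_new, P_new

def run_filter(dz_path, h_fun, x0, P0, F, GQG, R, dt):
    """Run the filter; h_fun maps the current observation z to H."""
    x_hat, P, z = x0, P0, 0.0
    for dz in dz_path:
        H = h_fun(z)  # H uses past observations only
        x_hat, P = filter_step(x_hat, P, dz, H, F, GQG, R, dt)
        z += dz
    return x_hat, P

dt, steps = 0.01, 50
path_a = [0.1] * steps   # observation record growing quickly
path_b = [0.02] * steps  # observation record growing slowly

# Observation-dependent H: the covariance differs between the two records.
_, P_a = run_filter(path_a, math.cos, 0.0, 1.0, 0.0, 0.0, 1.0, dt)
_, P_b = run_filter(path_b, math.cos, 0.0, 1.0, 0.0, 0.0, 1.0, dt)

# Constant H: the covariance is the same for both records, as in the
# linear-Gaussian case, and can be precomputed off-line.
_, P_a_const = run_filter(path_a, lambda z: 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, dt)
_, P_b_const = run_filter(path_b, lambda z: 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, dt)
print(P_a, P_b, P_a_const, P_b_const)
```

For constant H with F = 0 and no process noise, the scalar Riccati equation has the solution P_0 / (1 + P_0 H^2 t / R), which the Euler recursion approximates; with H a function of z, no such off-line solution is available and P must be propagated alongside the estimate.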

We also note that one can compute the infinite set of conditional 
moments of x(t) directly from the stochastic differential equations 
derived in [24], and one finds that the moments of N(x; \hat{x}(t|t), P(t|t)) 
satisfy these equations. 